1.
Bioinformatics ; 40(2)2024 Feb 01.
Article in English | MEDLINE | ID: mdl-38364309

ABSTRACT

MOTIVATION: Estimating the individual inbreeding coefficient and pairwise kinship is an important problem in human genetics (e.g. in disease mapping) and in animal and plant genetics (e.g. inbreeding design). Existing methods, such as the sample correlation-based genetic relationship matrix, KING, and UKin, are either biased, unable to estimate inbreeding coefficients, or prone to producing a large proportion of negative estimates that are difficult to interpret. This limitation is partly due to a failure to model inbreeding explicitly. Since all humans are inbred to various degrees by virtue of shared ancestries, it is prudent to account for inbreeding when inferring kinship between individuals. RESULTS: We present "Kindred," an approach that estimates inbreeding and kinship by modeling latent identity-by-descent states that account for all possible allele sharing, including inbreeding, between two individuals. Kindred uses a non-negative least squares method to fit the model, which not only increases computational efficiency compared to the maximum likelihood method, but also guarantees non-negativity of the kinship estimates. Through simulation, we demonstrate the high accuracy and non-negativity of kinship estimates by Kindred. By selecting a subset of SNPs that are similar in allele frequencies across different continental populations, Kindred can accurately estimate kinship between admixed samples. In addition, we demonstrate that the realized kinship matrix estimated by Kindred is effective in reducing genomic control values via a linear mixed model in genome-wide association studies. Finally, we demonstrate that Kindred produces sensible heritability estimates on an Australian height dataset. AVAILABILITY AND IMPLEMENTATION: Kindred is implemented in C with multi-threading. It takes a VCF file or stream as input and works seamlessly with bcftools. Kindred is freely available at https://github.com/haplotype/kindred.


Subjects
Genome-Wide Association Study; Inbreeding; Animals; Humans; Australia; Genome; Gene Frequency; Pedigree
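
The non-negativity guarantee mentioned in the Kindred abstract comes from fitting with non-negative least squares. The sketch below is only an illustration of that general idea, not Kindred's implementation: it solves a toy problem by projected gradient descent, a simple swapped-in solver. The function name, the toy matrix, and the iteration count are all assumptions made for this example.

```python
def nnls_projected_gradient(A, b, iters=5000):
    """Minimise ||A x - b||^2 subject to x >= 0 (illustrative solver only)."""
    n_rows, n_cols = len(A), len(A[0])
    # Normal-equation pieces: G = A^T A and h = A^T b.
    G = [[sum(A[r][i] * A[r][j] for r in range(n_rows)) for j in range(n_cols)]
         for i in range(n_cols)]
    h = [sum(A[r][i] * b[r] for r in range(n_rows)) for i in range(n_cols)]
    # Safe step size: the largest absolute row sum of G bounds its top eigenvalue.
    step = 1.0 / max(sum(abs(v) for v in row) for row in G)
    x = [0.0] * n_cols
    for _ in range(iters):
        grad = [sum(G[i][j] * x[j] for j in range(n_cols)) - h[i]
                for i in range(n_cols)]
        # Gradient step, then projection onto the non-negative orthant.
        x = [max(0.0, x[i] - step * grad[i]) for i in range(n_cols)]
    return x


# Toy data (hypothetical): ordinary least squares would give x = [1, -1],
# whereas the constrained fit keeps every coefficient non-negative.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [1.0, -1.0, 0.0]
x = nnls_projected_gradient(A, b)
```

In this toy case the unconstrained solution has a negative coefficient, which is exactly the kind of hard-to-interpret estimate the abstract says the constraint avoids.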
2.
Res Sq ; 2023 May 15.
Article in English | MEDLINE | ID: mdl-37333260

ABSTRACT

Genome-wide DNA methylation studies have typically focused on quantitative assessments of CpG methylation at individual loci. Although methylation states at nearby CpG sites are known to be highly correlated, suggestive of an underlying coordinated regulatory network, the extent and consistency of inter-CpG methylation correlation across the genome, including variation between individuals, disease states, and tissues, remains unknown. Here, we leverage image conversion of correlation matrices to identify correlated methylation units (CMUs) across the genome, describe their variation across tissues, and annotate their regulatory potential using 35 public Illumina BeadChip datasets spanning more than 12,000 individuals and 26 different tissues. We identified a median of 18,125 CMUs genome-wide, occurring on all chromosomes and spanning a median of ~1 kb. Notably, 50% of CMUs had evidence of long-range correlation with other proximal CMUs. Although the size and number of CMUs varied across datasets, we observed strong intra-tissue consistency among CMUs, with those in testis encompassing those seen in most other tissues. Approximately 20% of CMUs were highly conserved across normal tissues (i.e. tissue independent), with 73 loci demonstrating strong correlation with non-adjacent CMUs on the same chromosome. These loci were enriched for CTCF and transcription factor binding sites, always found within putative TADs, and associated with the B compartment of chromosome folding. Finally, we observed significantly different, but highly consistent, patterns of CMU correlation between diseased and non-diseased states. Our first-generation, genome-wide, DNA methylation map suggests a highly coordinated CMU regulatory network that is sensitive to disruptions in its architecture.

3.
Stat Methods Med Res ; 31(2): 315-333, 2022 02.
Article in English | MEDLINE | ID: mdl-34931910

ABSTRACT

Cocaine addiction is an important public health problem worldwide. Cognitive-behavioral therapy is a counseling intervention for supporting cocaine-dependent individuals through recovery and relapse prevention. It may reduce patients' cocaine use by improving their motivation and enabling them to recognize risky situations. To study the effect of cognitive-behavioral therapy on cocaine dependence, the self-reported cocaine use with urine test data were collected at the Primary Care Center of Yale-New Haven Hospital. The outcomes are binary, including both the daily self-reported drug use and weekly urine test results. To date, generalized estimating equations have been widely used to analyze binary data with repeated measures. However, due to the existence of significant self-report bias in the self-reported cocaine use with urine test data, a direct application of the generalized estimating equations approach may not be valid. In this paper, we propose a novel mean corrected generalized estimating equations approach for analyzing longitudinal binary outcomes subject to reporting bias. The mean corrected generalized estimating equations can provide consistent and asymptotically normally distributed estimators under true contamination probabilities. In the self-reported cocaine use with urine test study, accurate weekly urine test results are used to detect contamination. The superior performance of the proposed method is illustrated by both simulation studies and real data analysis.


Subjects
Cocaine; Research Design; Bias; Computer Simulation; Humans; Self Report
4.
Nat Commun ; 12(1): 510, 2021 01 21.
Article in English | MEDLINE | ID: mdl-33479230

ABSTRACT

Accurate pathogenicity prediction of missense variants is critically important in genetic studies and clinical diagnosis. Previously published prediction methods have facilitated the interpretation of missense variants but have limited performance. Here, we describe MVP (Missense Variant Pathogenicity prediction), a new prediction method that uses a deep residual network to leverage large training data sets and many correlated predictors. We train the model separately in genes that are intolerant of loss-of-function variants and in those that are tolerant, in order to account for potentially different genetic effect sizes and modes of action. We compile cancer mutation hotspots and de novo variants from developmental disorders for benchmarking. Overall, MVP achieves better performance in prioritizing pathogenic missense variants than previous methods, especially in genes tolerant of loss-of-function variants. Finally, using MVP, we estimate that de novo coding variants contribute to 7.8% of isolated congenital heart disease, nearly doubling previous estimates.


Subjects
Computational Biology/methods; Deep Learning; Genetic Predisposition to Disease/genetics; Mutation, Missense; Neoplasms/genetics; Algorithms; Autism Spectrum Disorder/diagnosis; Autism Spectrum Disorder/genetics; Heart Defects, Congenital/diagnosis; Heart Defects, Congenital/genetics; Humans; Neoplasms/diagnosis; Reproducibility of Results; Sensitivity and Specificity
5.
Genome Res ; 30(9): 1364-1375, 2020 09.
Article in English | MEDLINE | ID: mdl-32883749

ABSTRACT

We present Nubeam (nucleotide be a matrix) as a novel reference-free approach to analyze short sequencing reads. Nubeam represents nucleotides by matrices, transforms a read into a product of matrices, and assigns numbers to reads based on the product matrix. Nubeam capitalizes on the noncommutative property of matrix multiplication, such that different reads are assigned different numbers and similar reads similar numbers. A sample, which is a collection of reads, becomes a collection of numbers that form an empirical distribution. We demonstrate that the genetic difference between samples can be quantified by the distance between empirical distributions. Nubeam includes the k-mer method as a special case, but unlike the k-mer method, it is convenient for Nubeam to account for GC bias and nucleotide quality. As a reference-free approach, Nubeam avoids reference bias and mapping bias, and can work with organisms without reference genomes. Thus, Nubeam is ideal to analyze data sets from metagenomics whole genome shotgun (WGS) sequencing, where the amount of unmapped reads is substantial. When applied to a WGS sequencing data set to quantify distances between metagenomics samples from various human body habitats, Nubeam recapitulates findings made by mapping-based methods and sheds light on contributions of unmapped reads. Nubeam is also useful in analyzing 16S rRNA sequencing data, which is a more prevalent type of data set in metagenomics studies. In our analysis, Nubeam recapitulated the findings that natural microbiota in mouse gut are resilient under challenges, and Nubeam detected differences in vaginal microbiota between cases of polycystic ovary syndrome and healthy controls.


Subjects
Metagenomics/methods; Whole Genome Sequencing/methods; Animals; Female; Gastrointestinal Microbiome; Humans; Mice; RNA, Ribosomal, 16S; Sequence Analysis, RNA/methods; Vagina/microbiology
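
The Nubeam abstract describes turning a read into a product of matrices and relying on the fact that matrix multiplication is noncommutative. The sketch below illustrates that idea only. The particular matrices and the two-generators-per-nucleotide code are assumptions chosen for this example, not the encoding published with Nubeam. The two generators used here are known to generate a free monoid of 2x2 non-negative integer matrices, so distinct reads yield distinct product matrices under this hypothetical code.

```python
def matmul(X, Y):
    """Multiply two 2x2 integer matrices stored as nested tuples."""
    return (
        (X[0][0] * Y[0][0] + X[0][1] * Y[1][0], X[0][0] * Y[0][1] + X[0][1] * Y[1][1]),
        (X[1][0] * Y[0][0] + X[1][1] * Y[1][0], X[1][0] * Y[0][1] + X[1][1] * Y[1][1]),
    )


# Hypothetical generators; each nucleotide is assigned a product of two of them.
GEN_L = ((1, 1), (0, 1))
GEN_R = ((1, 0), (1, 1))
NUCLEOTIDE_MATRIX = {
    "A": matmul(GEN_L, GEN_L),
    "C": matmul(GEN_L, GEN_R),
    "G": matmul(GEN_R, GEN_L),
    "T": matmul(GEN_R, GEN_R),
}


def read_to_matrix(read):
    """Map a read to the ordered product of its nucleotide matrices."""
    product = ((1, 0), (0, 1))  # start from the identity matrix
    for base in read:
        product = matmul(product, NUCLEOTIDE_MATRIX[base])
    return product


# Same bases, different order: the products differ because order matters.
forward = read_to_matrix("AC")
reverse = read_to_matrix("CA")
```

A k-mer count would treat "AC" and "CA" as unrelated tokens, whereas here both are summarised by small integer matrices from which numeric features can be derived. How the published method turns the product matrix into numbers is not specified in the abstract and is not reproduced here.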
6.
Bioinformatics ; 36(10): 3254-3256, 2020 05 01.
Article in English | MEDLINE | ID: mdl-32091581

ABSTRACT

SUMMARY: We present Nubeam-dedup, a fast and RAM-efficient tool to de-duplicate sequencing reads without a reference genome. Nubeam-dedup represents nucleotides by matrices, transforms reads into products of matrices, and assigns a unique number to each read based on its product. Thus, duplicate reads can be efficiently removed by using a collisionless hash function. Compared with other state-of-the-art reference-free tools, Nubeam-dedup uses 50-70% of CPU time and 10-15% of RAM. AVAILABILITY AND IMPLEMENTATION: Source code in C++ and manual are available at https://github.com/daihang16/nubeamdedup and https://haplotype.org. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
High-Throughput Nucleotide Sequencing; Software; Algorithms; Genome; Sequence Analysis, DNA
7.
Genet Med ; 22(2): 301-308, 2020 02.
Article in English | MEDLINE | ID: mdl-31467446

ABSTRACT

PURPOSE: Fetal fraction (FF) is the percent of cell-free DNA (cfDNA) in the mother's peripheral blood that is of fetal origin, which plays a pivotal role in noninvasive prenatal screening (NIPS). We present a method that can reliably estimate FFs by examining autosome single-nucleotide polymorphisms (SNPs). METHODS: Even at a very low sequencing depth, there are plenty of SNPs covered by more than one read. At those SNPs, we define read heterozygosity and demonstrate that the percent of read heterozygosity is a function of FF, which allows FF to be inferred. RESULTS: We first demonstrated the effectiveness of our method in inferring FF. Then we used the inferred FF as an informative alternative prior to computing Bayes factors to test for aneuploidy, and observed better power than the Z-test. In analysis of clinical samples, we were able to identify female-male twins thanks to the accurate FF inference. CONCLUSION: Knowing FF improves efficacy of NIPS. It brings a powerful Bayesian method, allows "no call" for samples with small FFs, renders screening for XXY syndrome simpler, and permits an adaptive design to sequence at a higher depth for samples with small FFs.


Subjects
Cell-Free Nucleic Acids/analysis; Fetal Development/genetics; Noninvasive Prenatal Testing/methods; Chromosome Aberrations; Female; Fetus; High-Throughput Nucleotide Sequencing/methods; Humans; Polymorphism, Single Nucleotide/genetics; Pregnancy; Prenatal Care; Prenatal Diagnosis/methods; Sequence Analysis, DNA/methods
8.
Genet Med ; 22(2): 450, 2020 Feb.
Article in English | MEDLINE | ID: mdl-31822850

ABSTRACT

An amendment to this paper has been published and can be accessed via a link at the top of the paper.

9.
Nat Commun ; 10(1): 5791, 2019 12 19.
Article in English | MEDLINE | ID: mdl-31857576

ABSTRACT

Edematous severe acute childhood malnutrition (edematous SAM or ESAM), which includes kwashiorkor, presents with more overt multi-organ dysfunction than non-edematous SAM (NESAM). Reduced concentrations and methyl-flux of methionine in 1-carbon metabolism have been reported in acute, but not recovered, ESAM, suggesting downstream DNA methylation changes could be relevant to differences in SAM pathogenesis. Here, we assess genome-wide DNA methylation in buccal cells of 309 SAM children using the 450 K microarray. Relative to NESAM, ESAM is characterized by multiple significantly hypomethylated loci, which is not observed among SAM-recovered adults. Gene expression and methylation show both positive and negative correlation, suggesting a complex transcriptional response to SAM. Hypomethylated loci link to disorders of nutrition and metabolism, including fatty liver and diabetes, and appear to be influenced by genetic variation. Our epigenetic findings provide a potential molecular link to reported aberrant 1-carbon metabolism in ESAM and support consideration of methyl-group supplementation in ESAM.


Subjects
DNA Methylation; Epigenome/genetics; Severe Acute Malnutrition/genetics; Adolescent; Adult; Case-Control Studies; Child, Preschool; CpG Islands/genetics; Epigenomics/methods; Female; Gene Expression Profiling; Humans; Infant; Jamaica/epidemiology; Malawi/epidemiology; Male; Mouth Mucosa; Prospective Studies; Retrospective Studies; Severe Acute Malnutrition/mortality; Survivors; Young Adult
10.
Bayesian Anal ; 14(2): 573-594, 2019 Jun.
Article in English | MEDLINE | ID: mdl-31608133

ABSTRACT

Bayesian variable selection regression (BVSR) is able to jointly analyze genome-wide genetic datasets, but slow computation via Markov chain Monte Carlo (MCMC) has hampered its widespread use. Here we present a novel iterative method to solve a special class of linear systems, which can increase the speed of BVSR model fitting tenfold. The iterative method hinges on the complex factorization of the sum of two matrices, and the solution path resides in the complex domain (instead of the real domain). Compared to the Gauss-Seidel method, the complex factorization converges almost instantaneously and its error is several orders of magnitude smaller than that of the Gauss-Seidel method. More importantly, its error is always within the pre-specified precision, while that of the Gauss-Seidel method is not. For large problems with thousands of covariates, the complex factorization is 10-100 times faster than either the Gauss-Seidel method or the direct method via the Cholesky decomposition. In BVSR, one needs to repeatedly solve large penalized regression systems whose design matrices change only slightly between adjacent MCMC steps. This slight change in the design matrix enables the adaptation of the iterative complex factorization method. The computational innovation will facilitate the widespread use of BVSR in reanalyzing genome-wide association datasets.
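
For orientation, the Gauss-Seidel method that the abstract uses as a baseline can be sketched in a few lines. This is the textbook iteration, not the complex factorization the paper proposes, whose details are not given in the abstract. The small matrix below is a hypothetical stand-in for a penalized regression system of the form (X'X + λI)β = X'y.

```python
def gauss_seidel(A, b, tol=1e-12, max_iter=10_000):
    """Solve A x = b by Gauss-Seidel sweeps (A assumed symmetric positive definite)."""
    n = len(b)
    x = [0.0] * n
    for _ in range(max_iter):
        max_change = 0.0
        for i in range(n):
            # Use the freshest values of the other coordinates.
            off_diagonal = sum(A[i][j] * x[j] for j in range(n) if j != i)
            updated = (b[i] - off_diagonal) / A[i][i]
            max_change = max(max_change, abs(updated - x[i]))
            x[i] = updated
        if max_change < tol:
            break
    return x


# Hypothetical 2x2 system; its exact solution is (1/11, 7/11).
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
beta = gauss_seidel(A, b)
```

The stopping rule here watches the change between sweeps rather than the true error, which is one reason an iterative baseline may stop without meeting a pre-specified precision on harder systems.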

11.
Blood Adv ; 2(24): 3637-3647, 2018 12 26.
Article in English | MEDLINE | ID: mdl-30578281

ABSTRACT

Red blood cell (RBC) transfusion remains a critical therapeutic intervention in sickle cell disease (SCD); however, the apparent propensity of some patients to regularly develop RBC alloantibodies after transfusion presents a significant challenge to finding compatible blood for so-called alloimmunization responders. Predisposing genetic loci have long been thought to contribute to the responder phenomenon, but to date, no definitive loci have been identified. We undertook a genome-wide association study of alloimmunization responder status in 267 SCD multiple transfusion recipients, using genetic estimates of ancestral admixture to bolster our findings. Analyses revealed single nucleotide polymorphisms (SNPs) on chromosomes 2 and 5 approaching genome-wide significance (minimum P = 2.0 × 10⁻⁸ and 8.4 × 10⁻⁸, respectively), with local ancestry analysis demonstrating similar levels of admixture in responders and nonresponders at implicated loci. Association at chromosome 5 was nominally replicated in an independent cohort of 130 SCD transfusion recipients, with meta-analysis surpassing genome-wide significance (rs75853687, P_meta = 6.6 × 10⁻⁹), and this extended to individuals forming multiple (>3) alloantibodies (P_meta = 9.4 × 10⁻⁵). The associated variant is rare outside of African populations, and orthogonal genome-wide haplotype analyses, contingent on local ancestry, revealed genome-wide significant sharing of a ~60-kb haplotype of African ancestry at the chromosome 5 locus (Bayes Factor = 4.95). This locus overlaps a putative cis-acting enhancer predicted to regulate transcription of ADRA1B and the lncRNA LINC01847, both members of larger ontologies associated with immune regulation. Our findings provide potential insights to the pathophysiology underlying the development of alloantibodies and implicate non-RBC ancestry-limited loci in the susceptibility to alloimmunization.


Subjects
Anemia, Sickle Cell/pathology; Black or African American/genetics; Chromosomes, Human, Pair 5/genetics; Isoantibodies/blood; Alleles; Anemia, Sickle Cell/genetics; Anemia, Sickle Cell/immunology; Chromosomes, Human, Pair 2/genetics; Genetic Loci; Genome-Wide Association Study; Genotype; Haplotypes; Humans; Polymorphism, Single Nucleotide; RNA, Long Noncoding/genetics; RNA, Long Noncoding/metabolism; Receptors, Adrenergic, alpha-1/genetics; Receptors, Adrenergic, alpha-1/metabolism
12.
J Am Stat Assoc ; 113(523): 1362-1371, 2018.
Article in English | MEDLINE | ID: mdl-30386004

ABSTRACT

We show that under the null, the 2 log(Bayes factor) is asymptotically distributed as a weighted sum of chi-squared random variables with a shifted mean. This claim holds for Bayesian multi-linear regression with a family of conjugate priors, namely, the normal-inverse-gamma prior, the g-prior, and the normal prior. Our results have three immediate impacts. First, we can compute analytically a p-value associated with a Bayes factor without the need of permutation. We provide a software package that can evaluate the p-value associated with Bayes factor efficiently and accurately. Second, the null distribution is illuminating to some intrinsic properties of Bayes factor, namely, how Bayes factor quantitatively depends on prior and the genesis of Bartlett's paradox. Third, enlightened by the null distribution of Bayes factor, we formulate a novel scaled Bayes factor that depends less on the prior and is immune to Bartlett's paradox. When two tests have an identical p-value, the test with a larger power tends to have a larger scaled Bayes factor, a desirable property that is missing for the (unscaled) Bayes factor.

13.
J Theor Biol ; 455: 342-356, 2018 10 14.
Article in English | MEDLINE | ID: mdl-30053386

ABSTRACT

Chikungunya, dengue, and Zika viruses are all transmitted by Aedes aegypti and Aedes albopictus mosquitoes; all three have been imported to Florida and have caused local outbreaks. We propose a deterministic model to study the importation and local transmission of these mosquito-borne diseases. The purpose is to model the importation of these viruses to Florida via travelers, the infection of local mosquitoes by infected travelers, and finally non-travel-related transmission to local humans by infected local mosquitoes. As a case study, the model is used to simulate the cumulative Zika cases in Florida. Since the disease system is driven by a continuing input of infections from outside sources, orthodox analytic methods based on the calculation of the basic reproduction number are inadequate to describe and predict its behavior. Via steady-state analysis and sensitivity analysis, effective control and prevention measures for these mosquito-borne diseases are tested.


Subjects
Aedes/virology; Disease Outbreaks; Models, Biological; Mosquito Vectors/virology; Zika Virus Infection; Zika Virus; Animals; Chikungunya Fever/epidemiology; Chikungunya Fever/transmission; Chikungunya virus; Dengue/epidemiology; Dengue/transmission; Dengue Virus; Florida/epidemiology; Humans; Zika Virus Infection/epidemiology; Zika Virus Infection/transmission
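
The abstract above describes a deterministic model in which local transmission is driven by a continuing inflow of imported infections. The sketch below is a deliberately simplified host-vector system integrated with Euler steps, written only to illustrate that structure. It is not the model from the paper, and every compartment, parameter name, and parameter value is a hypothetical choice for this example.

```python
def simulate_local_cases(import_rate, days=200, dt=0.1):
    """Return the cumulative fraction of locally acquired human infections."""
    # Hypothetical rates (per day), chosen for illustration only.
    beta_hv = 0.3   # mosquito-to-human transmission
    beta_vh = 0.3   # human-to-mosquito transmission
    gamma = 0.2     # human recovery
    mu = 0.1        # mosquito death
    susceptible = 1.0        # fraction of local humans still susceptible
    infected_local = 0.0     # locally infected humans
    infected_imported = 0.0  # infected travelers currently present
    infected_vectors = 0.0   # fraction of local mosquitoes infected
    cumulative_local = 0.0
    for _ in range(int(days / dt)):
        new_local = beta_hv * infected_vectors * susceptible
        infectious_humans = infected_local + infected_imported
        d_vectors = beta_vh * infectious_humans * (1.0 - infected_vectors) - mu * infected_vectors
        d_local = new_local - gamma * infected_local
        d_imported = import_rate - gamma * infected_imported
        susceptible -= dt * new_local
        infected_local += dt * d_local
        infected_imported += dt * d_imported
        infected_vectors += dt * d_vectors
        cumulative_local += dt * new_local
    return cumulative_local


without_import = simulate_local_cases(import_rate=0.0)
with_import = simulate_local_cases(import_rate=1e-4)
```

With no importation the toy system never leaves the disease-free state, whatever its reproduction number. A steady inflow of infected travelers is what seeds local cases, which is the feature the abstract says reproduction-number analysis alone does not capture.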
14.
Genet Med ; 20(8): 817-824, 2018 08.
Article in English | MEDLINE | ID: mdl-29120459

ABSTRACT

PURPOSE: Noninvasive prenatal screening (NIPS) sequences a mixture of the maternal and fetal cell-free DNA. Fetal trisomy can be detected by examining chromosomal dosages estimated from sequencing reads. The traditional method uses the Z-test, which compares a subject against a set of euploid controls, where the information of fetal fraction is not fully utilized. Here we present a Bayesian method that leverages informative priors on the fetal fraction. METHOD: Our Bayesian method combines the Z-test likelihood and informative priors of the fetal fraction, which are learned from the sex chromosomes, to compute Bayes factors. Bayesian framework can account for nongenetic risk factors through the prior odds, and our method can report individual positive/negative predictive values. RESULTS: Our Bayesian method has more power than the Z-test method. We analyzed 3,405 NIPS samples and spotted at least 9 (of 51) possible Z-test false positives. CONCLUSION: Bayesian NIPS is more powerful than the Z-test method, is able to account for nongenetic risk factors through prior odds, and can report individual positive/negative predictive values.


Subjects
Bayes Theorem; Prenatal Diagnosis/methods; Sequence Analysis, DNA/methods; Adult; China; Female; Fetus; High-Throughput Nucleotide Sequencing/methods; Humans; Markov Chains; Pregnancy; Prenatal Care
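
The Bayesian NIPS abstract notes that nongenetic risk factors enter through the prior odds. The arithmetic behind that statement is the standard identity "posterior odds = Bayes factor × prior odds". The sketch below only applies that identity. The function name and the example numbers are assumptions for illustration and are not taken from the paper.

```python
def posterior_probability(bayes_factor, prior_probability):
    """Combine a Bayes factor with a prior probability of aneuploidy."""
    prior_odds = prior_probability / (1.0 - prior_probability)
    posterior_odds = bayes_factor * prior_odds
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical inputs: the same evidence (Bayes factor 99) is read against
# two different prior risks, e.g. reflecting different maternal ages.
low_prior = posterior_probability(bayes_factor=99.0, prior_probability=0.01)
high_prior = posterior_probability(bayes_factor=99.0, prior_probability=0.05)
```

Because the prior is set per individual, the resulting posterior probability can be read as an individual-level predictive value, which appears to be the sense in which the abstract speaks of individual positive/negative predictive values.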
15.
Clin Rheumatol ; 36(8): 1819-1826, 2017 Aug.
Article in English | MEDLINE | ID: mdl-28432524

ABSTRACT

The studies aimed to assess a set of biomarkers for their correlations with disease activity/severity of patients with ankylosing spondylitis (AS). A total of 24 AS patients were treated with etanercept and prospectively followed for 12 weeks. Serum levels of TNF-α, IFN-γ, TGF-β, IL6, IL15, IL17, MMP3, and MICA were measured at baseline and after treatment. The change of these biomarkers was analyzed for correlations with MRI indices for joint inflammation, Bath Ankylosing Spondylitis Disease Activity Index, Bath Ankylosing Spondylitis Functional Index, AS Disease Activity Score, serum CRP, and ESR. The Wilcoxon rank sum test was used to compare the biomarker levels between pre- and post-treatment and between pre-treatment and controls. Both step-wise procedures based on the Akaike information criterion (AIC) and least absolute shrinkage and selection operator with fivefold cross-validation were used to select the best model for pairwise correlations between the above clinical measures and the serum biomarkers. Serum levels of both MMP3 and IL6 were significantly higher in AS patients at baseline. After treatment, the levels of MMP3 decreased, but TGF-β and TNF-α increased significantly. The changes of serum MMP3 and MICA were significantly associated with MRI sacroiliac joint (SIJ) scores. CRP was positively correlated with serum MMP3 and IL6. The pattern of combined changes of serum MICA, MMP3, TGF-β, IL17, TNF-α, and IFN-γ predicted the MRI score of SIJ by logistic regression analysis. Specific serum biomarkers were significantly associated with clinical measures of AS. Most prominently, serum MMP3 level was found to have a positive correlation with the MRI score of SIJ and CRP. Serum MICA level negatively correlated with disease remission.


Subjects
Antirheumatic Agents/therapeutic use; Etanercept/therapeutic use; Matrix Metalloproteinase 3/blood; Spondylitis, Ankylosing/blood; Spondylitis, Ankylosing/drug therapy; Adult; Biomarkers/blood; C-Reactive Protein/analysis; Cytokines/blood; Female; Humans; Magnetic Resonance Imaging; Male; Middle Aged; Pilot Projects; Prospective Studies; Sacroiliac Joint/diagnostic imaging; Severity of Illness Index; Spondylitis, Ankylosing/diagnostic imaging; Treatment Outcome; Young Adult
16.
Biometrics ; 73(4): 1311-1320, 2017 12.
Article in English | MEDLINE | ID: mdl-28369699

ABSTRACT

Applications of spatial point processes for large and complex data sets with inhomogeneities as encountered, for example, in tropical rain forest ecology call for estimation methods that are both statistically and computationally efficient. We propose a novel second-order quasi-likelihood procedure to estimate the parameters for a second-order intensity reweighted stationary spatial point process. Our approach is to derive first- and second-order estimating functions and then combine them linearly using appropriate weight functions. In the stationary case, we argue that the asymptotically optimal weight functions are respectively a constant and a function of lags between distinct locations in the observation window. This leads to a considerable gain in computational efficiency. We further exploit this simplification in the nonstationary case. Simulations show that, when compared with several existing approaches, our method can achieve significant gains in statistical efficiency. An application to a tropical rain forest data set further illustrates the advantages of our procedure.


Subjects
Biometry; Ecology; Models, Statistical; Algorithms; Computer Simulation; Rainforest
17.
Nat Commun ; 7: 12065, 2016 06 30.
Article in English | MEDLINE | ID: mdl-27356984

ABSTRACT

Short-read sequencing has enabled the de novo assembly of several individual human genomes, but with inherent limitations in characterizing repeat elements. Here we sequence a Chinese individual HX1 by single-molecule real-time (SMRT) long-read sequencing, construct a physical map by NanoChannel arrays and generate a de novo assembly of 2.93 Gb (contig N50: 8.3 Mb, scaffold N50: 22.0 Mb, including 39.3 Mb N-bases), together with 206 Mb of alternative haplotypes. The assembly fully or partially fills 274 (28.4%) N-gaps in the reference genome GRCh38. Comparison to GRCh38 reveals 12.8 Mb of HX1-specific sequences, including 4.1 Mb that are not present in previously reported Asian genomes. Furthermore, long-read sequencing of the transcriptome reveals novel spliced genes that are not annotated in GENCODE and are missed by short-read RNA-Seq. Our results imply that improved characterization of genome functional variation may require the use of a range of genomic technologies on diverse human populations.


Subjects
Asian People/genetics; Genome, Human; Genomic Structural Variation; Humans; Male; Sequence Analysis, DNA; Sequence Analysis, RNA; Transcriptome
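
The HX1 abstract reports contig and scaffold N50 values. N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. A minimal sketch of that calculation follows. The contig lengths in the example are made up and have no connection to the HX1 assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or scaffold) lengths."""
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical contig lengths, in arbitrary units.
example_n50 = n50([2, 3, 4, 5, 6])
```

Here the total is 20, and the two longest contigs (6 and 5) already cover 11 of it, so the N50 is 5.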
18.
Stat Med ; 35(24): 4306-4319, 2016 10 30.
Article in English | MEDLINE | ID: mdl-27241902

ABSTRACT

Recurrent event data are quite common in biomedical and epidemiological studies. A significant portion of these data also contain additional longitudinal information on surrogate markers. Previous studies have shown that popular methods using a Cox model with longitudinal outcomes as time-dependent covariates may lead to biased results, especially when longitudinal outcomes are measured with error. Hence, it is important to incorporate longitudinal information into the analysis properly. To achieve this, we model the correlation between longitudinal and recurrent event processes using latent random effect terms. We then propose a two-stage conditional estimating equation approach to model the rate function of recurrent event process conditioned on the observed longitudinal information. The performance of our proposed approach is evaluated through simulation. We also apply the approach to analyze cocaine addiction data collected by the University of Connecticut Health Center. The data include recurrent event information on cocaine relapse and longitudinal cocaine craving scores. Copyright © 2016 John Wiley & Sons, Ltd.


Subjects
Data Accuracy; Longitudinal Studies; Cocaine-Related Disorders; Humans; Recurrence
19.
PLoS Genet ; 12(2): e1005847, 2016 Feb.
Article in English | MEDLINE | ID: mdl-26863142

ABSTRACT

Mexicans are a recent admixture of Amerindians, Europeans, and Africans. We performed local ancestry analysis of Mexican samples from two genome-wide association studies obtained from dbGaP, and discovered that at the MHC region Mexicans have excessive African ancestral alleles compared to the rest of the genome, which is the hallmark of recent selection for admixed samples. The estimated selection coefficients are 0.05 and 0.07 for two datasets, which put our finding among the strongest known selections observed in humans, namely, lactase selection in northern Europeans and sickle-cell trait in Africans. Using inaccurate Amerindian training samples was a major concern for the credibility of previously reported selection signals in Latinos. Taking advantage of the flexibility of our statistical model, we devised a model fitting technique that can learn Amerindian ancestral haplotype from the admixed samples, which allows us to infer local ancestries for Mexicans using only European and African training samples. The strong selection signal at the MHC remains without Amerindian training samples. Finally, we note that medical history studies suggest such a strong selection at MHC is plausible in Mexicans.


Subjects
Gene Pool; Major Histocompatibility Complex/genetics; Selection, Genetic; Black People/genetics; Gene Dosage; Genealogy and Heraldry; Humans; Mexico; Principal Component Analysis; White People/genetics
20.
Stat Med ; 35(14): 2422-40, 2016 06 30.
Article in English | MEDLINE | ID: mdl-26790617

ABSTRACT

Spatiotemporal calibration of output from deterministic models is an increasingly popular tool to more accurately and efficiently estimate the true distribution of spatial and temporal processes. Current calibration techniques have focused on a single source of data on observed measurements of the process of interest that are both temporally and spatially dense. Additionally, these methods often calibrate deterministic models available in grid-cell format with pixel sizes small enough that the centroid of the pixel closely approximates the measurement for other points within the pixel. We develop a modeling strategy that allows us to simultaneously incorporate information from two sources of data on observed measurements of the process (that differ in their spatial and temporal resolutions) to calibrate estimates from a deterministic model available on a regular grid. This method not only improves estimates of the pollutant at the grid centroids but also refines the spatial resolution of the grid data. The modeling strategy is illustrated by calibrating and spatially refining daily estimates of ambient nitrogen dioxide concentration over Connecticut for 1994 from the Community Multiscale Air Quality model (temporally dense grid-cell estimates on a large pixel size) using observations from an epidemiologic study (spatially dense and temporally sparse) and Environmental Protection Agency monitoring stations (temporally dense and spatially sparse). Copyright © 2016 John Wiley & Sons, Ltd.


Subjects
Models, Statistical; Spatio-Temporal Analysis; Air Pollutants/analysis; Air Pollution/analysis; Air Pollution/statistics & numerical data; Biostatistics; Calibration; Connecticut; Environmental Exposure/analysis; Environmental Exposure/statistics & numerical data; Environmental Monitoring/statistics & numerical data; Humans; Nitrogen Dioxide/analysis; United States; United States Environmental Protection Agency